Adding a Rule
To add a rule, go to the Rules page. There are two ways to access the Rules page in Collibra DQ:
- From the left navigation bar.
- From the findings page.
To access the Rules page from the left navigation bar, click the wrench icon and then Rule Builder. From the Rule Builder page, select a data set and a rule type.
Instructions
- Search for a data set, or navigate to the Rule Builder page from the left navigation panel.
  - Rules can be applied to a data set only after a DQ job has run on it at least once.
- Click Load.
  - The schema and any previously saved rules populate.
- Select a rule type from the dropdown next to the Type label.
- Provide a rule name.
  - If you apply a preset rule, the rule name is auto-populated.
- Enter a rule condition.
  - Required only for the simple, freeform SQL, stat, or native rule types.
  - Provide a value in the condition/sql/function input field.
  - Press Ctrl+Space for IntelliSense suggestions.
- Select Low, Medium, or High for scoring severity (optional).
- Add any custom DQ dimensions for reporting (optional).
- Click Submit to save the rule.
The rule is measured on the next DQ job run for that particular data set.
Rule Types
Rule type | Description | Example |
---|---|---|
Simple rules | Simple rules are used when you want to filter a condition on a single column in a single table. | City = 'Baltimore' |
Freeform SQL rules | Freeform SQL rules are used when you want to apply a condition across multiple tables/columns and generally when more flexibility or customization is desired. | select * from dataset where name = 'Collibra' |
Preset rules | Preset rules are used for quickly adding strict condition checks. Commonly used conditions are available to add to any data set columns. | |
All built-in Spark functions are available to use in simple and freeform SQL rules. For the function reference, visit https://spark.apache.org/docs/2.3.0/api/sql/.
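The difference between a simple rule and a freeform SQL rule can be sketched as follows. This is only an illustration: an in-memory SQLite database stands in for the Spark SQL engine, the table contents are made up, and the helper names `simple_rule_hits` and `freeform_rule_hits` are hypothetical. Wrapping the simple condition in a `SELECT ... WHERE` is an assumption made for the sketch, not a description of how Collibra DQ evaluates rules internally.

```python
import sqlite3

# In-memory SQLite stands in for Spark SQL; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (name TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?)",
    [("Collibra", "Baltimore"), ("Acme", "Denver"), ("Globex", "Baltimore")],
)


def simple_rule_hits(condition: str) -> list:
    """Treat a simple rule as a filter condition on one table (assumed wrapping)."""
    return conn.execute(f"SELECT * FROM dataset WHERE {condition}").fetchall()


def freeform_rule_hits(query: str) -> list:
    """A freeform rule is already a complete SQL statement, so run it as given."""
    return conn.execute(query).fetchall()


# The two example rules from the table above.
simple_hits = simple_rule_hits("City = 'Baltimore'")
freeform_hits = freeform_rule_hits("select * from dataset where name = 'Collibra'")
print(len(simple_hits), len(freeform_hits))
```

In this sketch, the simple rule supplies only the condition, while the freeform rule supplies the whole query, which is what leaves room for joins and multi-column logic.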
Points and Percentage
For every x percent of rows that meet the rule condition, y points are deducted from the data quality score. For example, if a rule is triggered 10 times out of 100 rows, break records occur 10% of the time. If you enter 1 point for every 1 percent, 10 points are deducted from the overall score.
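The arithmetic can be sketched as a small function. The name `points_deducted` and its parameters are hypothetical, and the sketch assumes the deduction scales proportionally with the break percentage; whether Collibra DQ rounds to whole percentage steps or caps the deduction is not stated here.

```python
def points_deducted(break_rows: int, total_rows: int,
                    points: float, percentage: float) -> float:
    """Deduct `points` for every `percentage` percent of rows that break the rule.

    Assumes proportional scaling with no rounding and no cap.
    """
    if total_rows <= 0 or percentage <= 0:
        raise ValueError("total_rows and percentage must be positive")
    break_pct = 100.0 * break_rows / total_rows  # share of rows that hit the rule
    return points * (break_pct / percentage)


# The example from the text: 10 breaks in 100 rows, 1 point per 1 percent.
deduction = points_deducted(break_rows=10, total_rows=100, points=1, percentage=1)
print(deduction)
```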
Creating Your First Rule
Let’s create a simple rule on the shape_example data set using the information below.
- Search for “shape_example” and click “Load”
- Select “Simple Rule”
- Rule Name = lnametest
- Rule condition: @shape_example.lname = 'hootbeck' (expected to hit once per run, day over day).
- Points = 1
- Percentage = 1
- Click “Submit”
Once the rule has been submitted, the new rule appears in the list of rules for the data set.
Using the Rules tab
Rule scores will appear under the Rules tab on the findings page. You can also see more details in the bottom panel of the Rules page under the Rules and Results tabs.
Click the plus icon next to the Rule Name to drill into any available rule. The following table describes the columns when you drill into a rule:
Column | Description |
---|---|
dataset | The dataset to which the rule applies. |
ruleNm | The unique name of the rule. |
ruleValue | The column of your table, view, or schema the rule queries against. |
modType | One of four possible values that signify the audit sequence of a rule. |
updtTs | The timestamp of the last run of a rule. |
isActive | The binary value of whether a rule is active or not. Active rules have values of 1 and inactive rules have values of 0. |
userNm | The username of the user who generated the rule. |
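If you export or post-process these drill-in rows, the columns above can be modeled as a small record type. This is a minimal sketch: the class name `RuleRecord`, the Python types, and every sample value (including the timestamp, user name, and the `modType` placeholder) are assumptions made for illustration, not values taken from Collibra DQ.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class RuleRecord:
    """One drill-in row; field names mirror the column names in the table above."""
    dataset: str
    ruleNm: str
    ruleValue: str
    modType: str       # one of four audit-sequence values (not enumerated here)
    updtTs: datetime   # timestamp of the last run of the rule
    isActive: int      # 1 = active, 0 = inactive
    userNm: str

    def is_active(self) -> bool:
        """Translate the 1/0 flag into a boolean."""
        return self.isActive == 1


# All values below are placeholders for illustration only.
record = RuleRecord(
    dataset="shape_example",
    ruleNm="lnametest",
    ruleValue="lname",
    modType="<modType value>",
    updtTs=datetime(2024, 1, 1, 0, 0),
    isActive=1,
    userNm="example_user",
)
print(record.ruleNm, record.is_active())
```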